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Abstract 

Multivariate spatial field data are increasingly common and whose modeling typ¬ 
ically relies on building cross-covariance functions to describe cross-process relation¬ 
ships. An alternative viewpoint is to model the matrix of spectral measures. We 
develop the notions of coherence, phase and gain for multidimensional stationary pro¬ 
cesses. Coherence, as a function of frequency, can be seen to be a measure of linear 
relationship between two spatial processes at that frequency band. We use the coher¬ 
ence function to illustrate fundamental limitations on a number of previously proposed 
constructions for multivariate processes, suggesting these options are not viable for 
real data. We also give natural interpretations to cross-covariance parameters of the 
Matern class, where the smoothness indexes dependence at low frequencies while the 
range parameter can imply dependence at low or high frequencies. Estimation follows 
from smoothed multivariate periodogram matrices. We illustrate the estimation and 
interpretation of these functions on two datasets, forecast and reanalysis sea level pres¬ 
sure and geopotential heights over the equatorial region. Examining these functions 
lends insight that would otherwise be difficult to detect and model using standard 
cross-covariance formulations. 

Keywords: coherency; gain; multivariate random field; periodogram; 

PHASE; REANALYSIS; SQUARED COHERENCE; SPECTRAL DENSITY 


1 Introduction 


The theory of univariate continuous stochastic processes has become well developed over 
nearly a century of research. The past quarter century or so has seen an increasing interest 
and development of models for multivariate spatial processes. The recent review by |Ge n- 


ton and Kleiber (2015) gives a relatively comprehensive treatment of the basic approaches 
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that have been explored to build stochastic spatial models. In the discussion, Beyilacqua 


et al. (2015) pose the question, given the recent deluge of multivariate constructions, “which 


parametric model is more flexible?” Indeed, the relative strengths and weaknesses of multi¬ 
variate models have been explored only empirically, that is, by testing a battery of different 
models on particular datasets, and comparing performance either by likelihood values or 
by predictive cross-validation (in the multivariate context, spatial prediction is known as 
co-kriging). Thus, a fundamental open question is: (1) to what extent can the flexibilities of 
model constructions be compared theoretically? Additionally, most models are motivated in 
the covariance domain, and the natural follow-up question is: (2) are there other approaches 
than covariance to measure and quantify spatial dependence? 


We introduce the notion of spectral coherence, phase and gain for multidimensional and 
multivariate spatial random fields. We propose that these functions allow for natural partial 
answers to the critical open questions (1) and (2). We show that a number of previously 
proposed models lack sufficient practical flexibility in terms of prediction, and we suggest 
insights into well established models such as the multivariate Matern, where parameters such 
as the cross-covariance smoothness and range have had elusive direct interpretations that 
relate to process behavior. 

Let us illustrate the ideas developed in this manuscript by considering the time series 
case first. Suppose Z^t), i — 1, 2 is a bivariate complex-valued weakly stationary time series 
on t G M with covariance functions Cov(Z k (t + h),Z k (t)) = C k k(h) and cross-covariance 
functions Co v(Z k {t + h), Z e (t)) = C k z[t) for k ^ l. The corresponding spectral densities are 
fki{u) = (27t) _1 f R Cki(h) exp(— iu>h)dh for w e 1. Spectral modeling in time series is well 
developed; for example, traditional autoregressive moving average models imply processes 
with rational spectral densities. The function 

= IfeMP 

/11M/22M 

is known as the squared coherence function, and can be interpreted as a quantification of 
the linear relationship between Z\(t) and Z 2 (f) at frequency oj (Brockwell and Davis, 2009). 

We entertain two example datasets from the atmospheric sciences. Both are reforecast 
and reanalysis data products over the equatorial region based on a well established numer- 


(Brockwell and Davis, 2009 
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ical weather prediction (NWP) model. Reanalysis forecasts are from a fixed version of a 
NWP model that are run retrospectively to generate a large database of model forecasts 
and analyses (in this context, an analysis can be considered a best estimate of the current 
state of the atmosphere). First we look at forecasted surfaces of sea level pressure at daily 
forecast horizons between 24 and 192 hours. We show that coherence can be used as a 
diagnostic to assess forecast quality, and additionally illustrate frequency bands at which 
forecasts improve over time. The second dataset involves geopotential heights at differing 
pressure levels. We show that coherence and phase extract and highlight qualities of the 
spatial relationship between different pressure levels that are difficult to model using extant 
multivariate covariance constructions, and indeed illustrate some fundamental limitations of 
existing popular constructions. 

2 Spectra for Multivariate Random Fields 

Suppose Z(s) = (Zi(s),. .., Z p (s)) T G C p is a p-variate weakly stationary random held 
on s E R d admitting a matrix-valued covariance function C(h) = (Cij(h))fj =1 where 
Cij( h) = Cov(Z*(s + h), Zj( s)). For simplicity of exposition we suppose Z(s) is a mean zero 
process. For complex-valued stationary processes, Cov(Zj(si), Z,(s 2 )) = E(Zj(si)Z,-(s 2 )), so 
that Cij( h) = Cji( h). The main obstacle to multivariate process modeling is developing flex¬ 
ible classes of matrix-valued covariance functions C that are nonnegative definite. We say 
C is nonnegative def ini te if, for any choices of a t k G C and locations s*, G R d for i = 1,... ,p 
and k — 1,..., n we have 

p p n n 

EEEE — O’ 

i =1 j= 1 k =1 t= 1 

Note this reduces to the usual definition of nonnegative definiteness for a univariate covari¬ 
ance, p — 1. 

For univariate processes, Bochner’s Theorem states that Ca( h) is a valid (i.e., nonnegative 
definite) function if and only if it can be written 

(7(h) = f exp(?'aj T h)cLF(a;) 

JR d 
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where F is a positive finite measure (Stein, 1999). If F admits a density / with respect to the 
Lebesgue measure on M d , we call it the spectral density for C. The multivariate extension of 
Bochner’s fundamental result is given by Cramer ( |194 0), and is contained in the following 
theorem specialized to covariances admitting spectral densities. 


Theorem 1 (Cramer 1940). A matrix-valued function C : —» C pxp , C = (C'^)f J=1 is 

nonnegative definite if and only if 



exp(iw T h),/=(u>)du; 


for i,j = 1 ,... ,p such that the matrix f(w) = (fij(&))i j=i is nonnegative definite for all 

uj e R d . 


The functions fij{&) are the spectral and cross-spectral densities for the marginal and 
cross-covariance functions C'y(h). Note that /^(cn) = ff (uj). When the spectral density 
exists, it can be solved for as the Fourier transform of the covariance function, 

fa(u) = tww [ exp(-^ T h)C ij (h)dh. 

( Z7r ) JR d 

Theorem [l] has primarily been used in practice to build multivariate covariance models, by 
specifying matrices of spectral densities that are nonnegative definite for all frequencies. 


2.1 Coherence 

In time series, the notion of frequency coherence is well developed, and can be used, for 
instance, to assess whether one time series is related to another by a time invariant linear 
filter. These notions carry over to the spatial case, and form the point of entry for our 
analyses. 

If Cij(h),i,j = 1,2 form a matrix-valued covariance function with associated spectral 
densities /^(cn), then define the coherence function (or coherency function) 

fl2 

\/ /ll(w)/22(w) 

We might assume fu(oj) > 0 for all w 6 l d for j = 1,2, but can define 7 ( 0 ;) = 0 if 
fn{oj) = 0. The coherence function can be complex-valued, so in practice we examine 
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the absolute coherence function, | 7 (cj)|. The real-valued function | 7 (cj )| 2 is the squared 
coherence function, and by Theorem [lj 0 < ( 7 ( 0;)| 2 < 1 for all co. Values of | 7 (w)j near 
unity indicate a linear relationship between Z\{s) and Z 2 (s) at particular frequency bands. 


The following theorem relates the coherence to optimal prediction of a random process 
based on another process. The predictive estimator is based on a kernel smoothed process 
which is a natural predictor given the interpretation of the univariate kriging weights as a 
kernel function (Kleiber and Nychka, 2015). 


Theorem 2. Suppose (Zi(s), Z 2 (s)) T is a complex-valued mean zero weakly stationary bi¬ 
variate field with matrix-valued covariance C(h) admitting a spectral density matrix f(u>) = 
(/jj(u?)) 2 J=1 that is everywhere nonzero. Then the continuous square integrable function 
Ii( u) : R d —> C that minimizes E|Z 1 (s 0 ) — f Rd K {u — s 0 )Z 2 (u)du| 2 is 


K( u) 





7(u>)da;. 


( 1 ) 


Additionally, the spectral density of the predictor Zi( s o) = f R d K(u — s 0 )Z 2 (u)du is 


/i| 2 (w) = /11MI7MI 2 


( 2 ) 


for all co E M d . 


In particular, the relationship (|TJ) implies that the optimal weighting function is modulated 
by the coherence between the two processes, and indeed has greater spectral weight on 
frequencies with high coherence. An immediate corollary to Theorem [2] is 


ItMI 2 = 


flpjv) 
AiM' 


Thus, the coherence has an attractive interpretation as the amount of variability that can 
be attributed to a linear relationship between two processes at a particular frequency. In the 
following development, we use the coherence function to illuminate fundamental limitations 
on some popular multivariate covariance constructions. 


The coherence function can be used as a tool to compare proposed multivariate models, as 
an indicator of the amount of flexibility of bivariate relationships at differing frequencies. For 
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example, a rather classic approach to specifying covariances is separability, setting C(h) = 
RC'(h) where C'(h) is a univariate covariance function and R is a p x p positive definite 
matrix ( |Mardia and Goodall 1993; Helterbrand and Cressie, 1994 Bhat et al. 2010). This 
approach has been empirically shown to be insufficiently flexible, and the following lemma 
contributes to the empirical results. 


Lemma 3. If C(h) = RC(h) where C : —> M is a covariance function and R is a p x p 
positive definite matrix with ( i,j)th entry r^, then the squared coherence between the ith and 
jth process is constant, in particular = ( r ij r ji)/{ r u r jj ). 


A more sophisticated method of generating multivariate covariance structures is to con¬ 


volve univariate square integrable functions (Gaspari and Cohn 1999 Oliver, 2003 Gaspari 


et al. 2006 Majumdar and Gelfand, 2007). In particular, if c* 


->■ 


are square in¬ 


tegrate functions for i = 1,... ,p then Cy(h) = (c* * Cj)(h) is a valid matrix-covariance 
function where * denotes the convolution operator. This is sometimes known as covariance 
convolution (especially when <7 are positive definite functions to begin with). The following 
proposition suggests this approach to model building is overly-restrictive, and indeed implies 
that the resulting coherence is necessarily constant over all frequencies. 


Proposition 4. If c\ and C 2 are square integrable functions on M. d and a matrix-valued 
covariance is defined via C = Ci-kCj for i,j = 1,2 where * denotes convolution, then 
7 ( 0 ;) = 1 for all cj e such that the Fourier transforms of Ci and Cj are nonzero. 


Multivariate processes can sometimes be modeled as being related by local averaging. 
For example, the relationship between column integrated ozone observations and local ozone 
might be plausibly modeled as observations being locally averaged over the true underlying 


held (Cressie and Johannesson, 2008). Wind observations are often time averaged over 


moving windows to produce smoother and more stable observation series (Hering et al. 


2015). The following proposition characterizes the coherence in such situations, and serves 


to illustrate the intimate link between process relationship and coherence. 


Proposition 5. If Z i(s) is a weakly stationary stochastic process and Z 2 (s) = f Rd K(u — 
s)zd(u)du for some continuous square integrable kernel function K : —>• M that is sym¬ 
metric, then 'y(io) = 1 for all lo G such that the Fourier transform of K is nonzero. 
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According to Proposition [5j estimated coherences near unity over all frequency bands may 
be indicative of a linear or local averaged relationship between processes, and this result may 


serve as the theoretical basis for testing such a hypothesis. Fuentes (2006) uses a similar 
notion to develop a test for separability of space-time processes. 


The kernel convolution method, introduced by Ver Hoef and Barry (1998) and Ver Hoef 


et al. (2004), originally involved representing a process as a moving average against a white 


noise process. In simple cases this yields the covariance convolution model. This can be 
generalized to 


Z k { s) = / £fc(x - s)VF(x)dx 


( 3 ) 


where VF(x) is a mean zero stationary process with covariance (7(h), and : M. d —> R 
are square integrable symmetric kernel functions for k — 1 ,... ,p. As with each previous 
construction, this approach also yields constant coherence. 


Proposition 6. If Z\ (s) and Z 2 (s) are constructed as in 0. then 7 (w) is constant for all 

uj e R d . 


For all of these models, separable, covariance convolution and kernel convolution, the 
resulting multivariate structure is restricted to constant coherence. In light of Theorem [2j 
this suggests that none of these models can attain optimal prediction for any multivariate 
processes exhibiting nontrivial coherences. Indeed the examples below in Section [4] exhibit 
nonconstant coherence, and call for more flexible modeling frameworks. 


2.2 Phase and Gain 


Similar notions to frequency coherence can be motivated by examining spectral density 
matrices. If (Zi(s), Z 2 (s)) T is a stationary random vector with spectral density matrix 
(fij{^))ij=i then define A(u) = /i 2 (a;)//n(a;). Note that A(u>) is possibly complex-valued. 
We define the gain function G(uj) = |A(u>)| which is sometimes referred to as the gain of 


Z 2 (s) on Zi(s) in time series (Brockwcll and Davis, 2009). Additionally define the phase 


function at frequency as = arg A(cj), the complex argument of A(uj). Note that the 
phase function satisfies 6 (— 7r, 7 t] and <f>{—u>) = —0(cj). 
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The interpretations of gain and phase are most clear when considering processes built by 
the relationship Zi(s) = aZ 2 (s — u) for some u G and a ^ 0, i.e., Z\ is a shifted and 
rescaled version of Z 2 . Then it is straightforward to show that the phase function is 


0M = 


-u> T u (mod 27 t), a > 0 
n — uz T u (mod 2n) a < 0. 


Li and Zhang (2011) develop an approach to modeling this type of asymmetric cross¬ 


covariance behavior. This result shows that their construction will have phase shift function 
that depends on the angle a/ r u. This result may be used as an exploratory data approach 
or as the basis for a statistical test to suggest whether a pair of spatial processes exhibit 
an asymmetric relationship; Li and Zhang (201lJ use the empirical cross-correlation func¬ 
tion to visually assess such asymmetric behavior. The gain function in this case is simply 
G{uj) = |a|; all frequency components of Z 2 are exaggerated by an amount a for Z x . 


Below we consider some multivariate constructions that are particular to real-valued pro¬ 
cesses having real-valued spectral matrices. Any model with real-valued cross-spectral den¬ 
sity has 0(u) = 0, but a possibly non-trivial gain function. Thus, testing for cp(u >) = 0 can 
be viewed as a test for a real-valued cross-spectral density, which seems very relevant given 
most multivariate models are developed under this assumption. 


2.3 Revisiting the Multivariate Matern 


The multivariate Matern is a model for matrix-valued covariance functions such that each 
process is marginally described by a Matern covariance function, and all cross-covariance 


functions also fall in the Matern class (Gneiting et al. 2010 Apanasovich et al. 2012). 


Specifically, the multivariate Matern imposes Cu( h) = of M(h | z/*, a*) for i = j and CV, ( h) = 
p ij a i a j M(h\u ij ,a ij ) for 1 < i ^ j < p. Here, M(h | u, a) = (2 1_i/ /r(z/))(a||h||) y K l/ (a||h||) 


where K u is a modified Bessel function of the second kind of order v. Gneiting et al. (2010) 


and Apanasovich et al. (2012) discuss restrictions on the parameters v t , a t , , a ZJ and pij 

that result in a valid model. 


The Matern class is popular due to the smoothness parameters v > 0 which continuously 
index smoothnesses of the sample paths of the process. In particular, sample paths are m 

























times differentiable if and only if v > m, and there is an additional relationship between 


v and the fractal dimension in that sample paths have dimension rna x(d,d + 1 — v) (Goff 


and Jordan, 1988 Handcock and Stein. 1993). These interpretations and implications also 


hold in the multivariate case, where zy indexes the smoothness of the ith component process 
Zi( s). The parameters a, act as range parameters, and control the rate of decay of spatial 
correlation away from the origin. 


A standing issue with the multivariate Matern is that the cross-covariance parameters, 
Uij and dij for i ^ j, do not have straightforward interpretations that are analogous to the 
marginal smoothness and range interpretations, and indeed nowhere in the literature have 
these parameters been linked directly to process behavior. We find that these parameters 
have direct interpretations when considering the coherence function between two processes. 


The coherence function for a bivariate process with multivariate Matern correlation is 

2 _ 2 r(z/ 12 + d/2) 2 r(z/ 1 )T(z/ 2 ) a% 12 (a 2 + ||u>|| 2 ;p +d / 2 (a 2 + \\u:\\ 2 )^ +d / 2 

7M “ P T(z/! + d/2)Y(u 2 + d/2)Y{u l2 y a 2 ^a 2 p (a 2 2 + |M| 2 ) 2 ^+ d 

We explore in detail two simplified versions of this coherency function. First, consider the 
case where a± = a 2 = a 12 = a, all covariance and cross-covariance functions share a common 
range. Then 


7M 2 


2 Y(u 12 + d/2) 2 Y(u 1 )Y(u 2 ) 

P T(z / 1 + d/2)Y(is 2 + d/2)Y(u 12 f 


(«+INI| 2 ) 


^ 1 +^ 2 — 2^12 


(4) 


The multivariate Matern is a valid model only if z / 12 > {y\ + v 2 )/2, and thus by inspecting 
Q we note some modeling implications resulting from these restrictions. First, if Ui 2 = 
(yi + v 2 )/2, the coherency is constant across all frequencies, implying a constant linear 
relationship between two processes at all frequency bands. Second, if zq 2 > {yi + ^ 2 )/2, we 
have greater coherency at low frequencies, with 7 ( 0 ;) —)• 0 as ||u;|| —^ 00 . This analysis seems 
to suggest a natural interpretation of the cross-covariance smoothness in that it controls the 
amount of cross-process dependence at various frequencies, but can only imply greater or 
equal coherence at low frequencies versus high frequencies. Figure [T| serves to illustrate these 
points, showing various coherence functions for common length scale parameters an = a± 2 = 
a 2 2 = 1 , varying i / 12 and p. 
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Figure 1: Coherence functions for various bivariate Materns with an = ai 2 = a 22 = 1, zqi = 
U 22 = 0.5, varying zq 2 and p. 

Perhaps surprisingly, a similar analysis suggests the cross-covariance range parameter 
ai 2 induces potentially greater flexibility in the coherence function than the smoothness 
parameter zq 2 . In particular, if zq = zq = ^12 = zq the coherence function is 


7 (ca) 2 = p 2 


U2 


2 v 


(a 2 + ||uz|| 2 )(a 2 + ||u;|| 2 ) 


is-\-d/ 2 


( 5 ) 


apa 2 J \ (a 2 2 + ||ca|| 2 ) 2 

Examining ([5]), we see that, depending on whether a i2 < min(ai,a 2 ) or ai 2 > max(ai,a 2 ) 
the coherence will have distinct behavior. For simplicity, set a\ = a 2 = a. If ai 2 < a, the 
coherence will be greater for small frequencies than high frequencies, similar to the behavior 
implied by ([I]) when u 12 > (zq + zq)/2. However, here 7(0;) —>■ pa 2 2 /(aia 2 )^ as ||ca|| —> cx), 
implying non-negligible coherence between processes at high frequencies, unlike that in Q. 
If d \2 > a, we have the complementary result that 7 (^ 1 ) < 7 (u> 2 ) for ||cl 7 i|| < ||u; 2 ||, that is, 
two processes share behavior at high frequencies rather than low. Figure [2] illustrates these 
scenarios, and seems to suggest that, at least as far as coherence is concerned, the cross¬ 
covariance range parameter yields potentially greater flexibility than the cross-smoothness. 

The so-called parsimonious Matern model is defined by imposing common range param¬ 


eters as well as zq 2 = (zq + zq)/2 (Gneiting et al. 2010). This model has been empirically 
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Figure 2: Coherence functions for various bivariate Materns with Un = tq 2 = z /22 = l,an = 
a -22 — 1, varying 012 and p. 


shown to produce inferior model fits to datasets as compared to more general versions of the 


multivariate Matern as well as other multivariate classes (Gneiting et al. 2010 Apanasovich 


et ah, 2012). The coherence function for a bivariate parsimonious Matern model is constant, 


7 (w) = p, which suggests an inflexible model for the spectral behavior of spatial processes. 


We close this section with an empirical illustration of the implications of cross-covariance 
parameter choice on random field realizations and the associated low and high frequency 
behavior. We simulate two bivariate Matern models on an equally-spaced grid of 256 x 256 
in [0,8] 2 . In both cases we low-pass and high-pass filter the resulting bivariate field. The 
low-pass filter is a matrix of zeros except for a 3 x 3 grid of 1/9, while the high-pass filter is 
similar with a 3 x 3 grid of —1/9 on the edge and 8/9 in the center. The effect of the filters is 
to remove high frequency behavior (low-pass filtering) or low frequency behavior (high-pass 
filtering). 

Figure [3] shows low-pass filtered realizations of two bivariate Gaussian processes with bi¬ 
variate Matern covariances. The top row is the case with equal ranges an = 012 = 022 = 1 but 
with a greater cross-smoothness un = ^22 = 0.5, 17.2 = 1- According to Figure [lj we should 


11 





















(a) (b) (c) 


w 

A*. 

■ 2 

- -3 

V * 

Variable 1 Filtered 

-06 -04 -0 2 00 02 04 06 

1_1_1_1_1_1_1 

' t/ 

- * • . 

(d) 


(e) 


0.6 -04 -0.2 0.0 0.2 04 06 

Variable 2 Filtered 

(f) 


- 3 

- 2 

% 

ufly * a 

f * Lf 

005 0.10 

_1_1 



: 

- -2 

Variable 1 Filtered 

-010 -0.05 0.00 

_1_1_1_ 

• 


-0.10 -0.05 0.00 0.05 0.10 


Variable 2 Filtered 


Figure 3: Low-pass filtered bivariate Matern with an = ai 2 = <222 = 1, ^11 = n 22 = 0.5, iq 2 = 
1 and p = 0.5 for the (a) first process, (b) second process. Panel (c) is the pairwise scatterplot 
of high-pass filtered values of the two processes with contour levels for comparison. Panels 
(d), (e) and (f) are analogous plots for a bivariate Matern simulation with v> n = z/ 12 = v 22 = 
1, an = a 22 = 1, ai 2 = V2 and p = 0.5. 

expect the realizations to show similar low frequency behavior, but dissimilar high frequency 
behavior. Indeed, panels (a) and (b) show similar low frequency behavior, while the pairwise 
scatterplot of high-passed values, panel (c), suggests little correlation at high frequencies. 
Complementary, the second row is the case with equal smoothnesses, u u = u 12 = z / 21 = 1 
but a greater cross-range parameter, an = a 22 = l,ai 2 = \/2. Figure [ 2 ] suggest we should 
expect less coherence at low frequencies while having greater correlation at high frequen¬ 
cies. Again, these theoretical results are reinforced: panels (d) and (e) are not suggestive of 
strong low frequency coherence, while panel (f) exhibits positively correlated high frequency 
characteristics (and indeed has an empirical correlation coefficient of approximately 0.25). 
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2.4 The Linear Model of Coregionalization 


The linear model of coregionalization (LMC) is a competing framework for multivariate mod¬ 
eling, and is built by decomposing a multivariate process as linear combinations of uncorre¬ 


lated, univariate processes (Goulard and Voltz, 1992; Royle and Berliner, 1999; Wackernagel 


2003 Schmidt and Gelfand[ 2003). In particular, we entertain the following version, 

= BW(s) 


Z s = 


Zi(s) 
H s) 


bn b 12 

^21 b 2 2 


Wi(s) 

w 2 (s) 


( 6 ) 


The matrix B is known as the coregionalization matrix, and controls the strength of depen¬ 
dencies on the latent uncorrelated processes W. For the following, suppose VFi(s) and 
are uncorrelated processes with spectral densities fi(co) and / 2 (cj), respectively. 

Given the number of parameters in the LMC, it is often useful to impose restrictions on 


the coregionalization matrix B, such as setting bn = b 22 = 1 (Berrocal et al. 2010). The 


following Lemma is the unsurprising result that the LMC yields multivariate processes that 
are exactly coherent when the coregionalization matrix has zero determinant. 


Lemma 7. In the linear model of coregionalization ([6|), if b u = b 22 = 1 then the coherence 
function between -Zi(s) and Z 2 ( s) is unity if and only ifbi 2 b 2 \ = 1. 

Note that this result simply states that, under the LMC, two processes are exactly coherent 
when they differ only by a scalar multiplier. 

Under the same working assumptions, bn = b 22 = 1, we have the gain function of Z 2 (s) 
on Zi( s) is 


GM = 


&2i/i(w) + bi 2 f 2 (u>) 


AM +b 2 12 f 2 (u) 

If, as is common in using the LMC, we set b\ 2 = 0, we have the gain function is simply b 2 p 
that is, there is constant gain at all frequencies by the amount of coregionalization, b 2 \. The 
complementary case where b 2 i = 0 yields the gain 

GM = ,, , , 

/i(u>) + b 12 f 2 (u>) 

that is, the relative contribution of component b\ 2 W 2 (s) to the combined spectrum of Z\{ s). 
As mentioned previously, if the latent processes have real-valued spectral densities, the phase 
function is exactly zero at all frequencies. 
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3 Estimation of Spectra 


Suppose Z(s) is a p- variate process that has been observed at a regular grid of points {s*}A 1; 
of marginal dimensions n t , i = 1 where N = nliTn. If grid spacing in the ?'th 

dimension is Si, define 8 = Jl^ =1 Si. Then the spatial periodogram matrix with (k, T)th entry 
is dehned as I(u>) = (I^(u;))^ =1 where 


N 


N 




z k ( Sfc) exp(-isja;) ^ Z e (s k ) exp(-is^) 


(7) 


,fc=i 


,fc=i 


(2n)PN 

is available at Fourier frequencies u> = 2nf where f = (fi/(5ini), ..., /d/(5d^d)) T f° r fi £ 


{-L(wi - 1)/2J,. • • ,rii - \rii/2\}. Note that I k e(uj) = 

Whereas in the time series literature it is natural to consider asymptotics as time t —> oo, 
resulting in effectively uncorrelated blocks of a process, in the spatial realm there are two 
competing asymptotic frameworks. Increasing domain asymptotics is similar to the time 
series case where samples are taken on an ever-increasing domain in all axial directions, and 
typically asymptotic results here echo those in time series. The complementary version is 
infill asymptotics (sometimes called fixed-domain asymptotics) in which the domain bound¬ 


ary is fixed and points are sampled at an ever finer resolution within the domain (Zhang and 


Zimmerman 2005). Depending on the asymptotic framework under consideration, the large 


sample properties of the periodogram (J7|) change. 

Using infill asymptotics, Lim and Stein (2008) show that the raw multivariate periodogram 
can exhibit bias at low frequencies, and suggest prewhitening the process to overcome this 


inadequacy (in the univariate case Stein (1995) gives a simulated example where the bias is 
quite substantial). However, under a mixture of infill and increasing domain asymptotics, 


Fuentes (2002) showed (for univariate processes) the analogous result to the time series case 


that the periodogram is asymptotically unbiased and is uncorrelated at differing Fourier 
frequencies. Additionally, in this latter case it is not a consistent estimator, but must be 
smoothed to gain consistency. 


Analogous to the time series and univariate spatial held case, under certain assumptions 
the nonparametric periodogram ([7]) is asymptotically unbiased, and generates asymptotically 
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uncorrelated random variables between distinct Fourier frequencies. The following theorem 
illustrates this feature of the matrix-valued periodogram. We use the same assumptions to 


Fuentes (2002), generalized to the multivariate setting. 


Al The true spectral densities /^(u?) decay as ||u>|| t ,t > 2 as ||u>|| —> oo, u> G M 2 . 
A 2 The marginal and cross-covariances satisfy f ||h|||Cfc^(h)|dh < oo, h £ M 2 . 

A3 Si —* 0, rti —y oo and 8,711 00 for all i, j — 1 ,. .., d such that rti/rtj —>• X tJ > 0. 

Theorem 8. Under the assumptions A1-A3, we have 


(i) —> fke(co), 

(ii) Vailke{u) = f ke {u)) 2 and 

(in) Cov( 4 £ (uq), 4 £ (oj 2 )) 0 foru l ^ w 2 . 


The proof for Theorem [ 8 ] follows directly from Fuentes (2002) and is not included here. 


According to Theorem [ 8 j the matrix-valued periodogram is not an asymptotically con¬ 
sistent estimator. To produce a consistent estimator of the spectral density at a particular 
frequency w 0 , in practice we locally smooth adjacent periodogram values and appeal to 
property (iii) of Theorem [ 8 j In particular, the smoothed matrix-valued periodogram is 

hi{u Q ) = / K\(u> — u}o)I k e(u})dF n (u>) ( 8 ) 

J R d 


where F n (uj) is the empirical cumulative distribution function of Fourier frequencies {u 
Here, K\ is some kernel function with bandwidth A, where, as we have it written, the same 
kernel is applied to each process. Naturally, different kernels may be used for different 
processes if the scientific context calls for such an approach. 


Note that we can’t directly use the nonparametric periodogram fraction to estimate the 
coherence as / fc ^(u?)4fe(u;) = at all Fourier frequencies. Thus, we estimate the 

coherence functions by using the smoothed periodograms, 

|4iMI 2 

Ikk(u)Iee(uj 



for k, £ = 1 ,... ,p. 
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4 Illustrations 


We examine two datasets from the atmospheric sciences, gridded reforecasts and reanalyses 
of sea level pressure and geopotential heights over the equatorial region. Reforecast data are 
produced retrospectively from a fixed version of a numerical weather prediction model, in 
this case the 2nd generation National Oceanic and Atmospheric Administration’s (NOAA) 


Global Ensemble Forecast System Reforecast (Hamill et al., 2013). Forecasts are generated 
at 3 hour increments from 0 to 192 hours, with the Oh forecast being a reanalysis, that is, 
an estimate of the current state of the atmosphere. For the data below, the control initial 
conditions were produced using a hybrid ensemble Kalman filter-variational analysis system 


(Hamill et al. 2011). 


4.1 Sea Level Pressure 

The first dataset we consider is a set of reforecast sea level pressures (SLP) over the equatorial 
region. Sea level pressures in this region are approximately stationary, and we compare 
forecast horizons in 24 hour increments from Oh to 192h (8 days out). The data consist of 
gridded reforecasts from the Erst 90 days of 2014 at 1° increments over 360 longitude and 
47 latitude bands between —23° to 23° defining the equatorial region. 

One approach to examining the quality of forecasts is the coherence between the forecast 
with the corresponding reanalysis. For example, we might compare the 24h forecast of SLP 
generated on January 1, 2014 to the Oh reanalysis generated on January 2, 2014. It is well 
known that forecast skill decays with horizon, and we expect the short-term forecasts to 
share higher coherence with the reanalyses than the long-term forecasts. 

We begin by standardizing each analysis and forecast horizon grid cell by subtracting the 
temporal average and diving by empirical standard standard deviation to produce forecast 
anomalies. Denote these anomalies by Zk(s,d) for forecast horizons k = 0,1,2,..., 8 cor¬ 
responding to forecast horizons 0, 24,..., 192 hours, spatial locations s E V C M 2 in the 
equatorial region D on days d — 1 ,..., 90, i.e. , the first 90 days of 2014. 

Each day’s marginal process empirical periodogram ([7]) is calculated for all forecast hori¬ 
zons fc, yielding d)}. The smoothed periodogram is a convolution with a simple 
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168 hour forecast 


96 hour forecast 


24 hour forecast 



Figure 4: Estimated absolute coherence functions for the GEFS sea level pressure reforecast 
data, comparing the 168, 96 and 24 hour forecast horizons with the zero hour analysis. The 
vertical axis spans [0,1]. 

low-pass filter, a matrix of zeros with a 3 x 3 constant block of 1/9. Interest focuses on 
comparing various forecast horizons with the reanalysis at k — 0, so we calculate empirical 
cross-periodograms {/ofc(<■«■>, d)} for all available days d allowing for forecast validation (e.g., 
the k — 1, 24h horizon, has 89 available days, d = 2,... ,90). The cross-periodograms are 
smoothed using the same low-pass filter as the marginals. If denotes the smoothed 

cross-periodograms, we estimate the squared coherence function as 



for k = 1 ,..., 8, that is, the average over all available daily smoothed cross-periodograms. 


Figure [4] shows estimated absolute coherence functions for horizons 168, 92 and 24h with 


the Oh analysis. Even at long lead lead times there is substantial coherence, which increases 
by a substantial margin at very low longitudinal frequencies. For any given longitude fre¬ 
quency band, the coherence appears to be relatively constant across latitudes, which is 
sensible given that there appears to be greater variability in the equatorial direction than 
in the north-south direction for sea level pressure in this region. As the forecast horizon 
decreases the coherence begins building between low-to-mid frequency bands in the latitudi¬ 
nal direction, suggesting that the statistical characteristics of short term forecasts are more 
similar to observed sea level pressure than the longer term forecasts. However, note that at 
the highest frequencies there is not a substantial improvement in forecast skill, bordering on 
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Tabic 1: Cross-covariance Matern family parameter estimates for GEFS sea level pressure 
reforecast data. 


Forecast horizon 

24 

48 

72 

96 

120 

144 

168 

192 

P 

0.96 

0.94 

0.91 

0.88 

0.85 

0.81 

0.76 

0.73 

«12 

0.079 

0.076 

0.075 

0.073 

0.071 

0.070 

0.068 

0.067 

^12 

1.06 

1.06 

1.05 

1.04 

1.03 

1.02 

1.01 

1.01 


no improvement, which suggests that small scale events are difficult to forecast even at one 
day out. 


To quantify the differences in forecast horizon skill, we estimate a set of bivariate Matern 
models. As each forecast and analysis arises from the same physical model, we assume that 
the marginal spectra follow the same statistical behavior, that is, we suppose the marginal 
spectral densities at equal at all horizons, /oo(<^) = fkk{u). Suggested by Figure [5] and 
exploratory analysis for the marginal spectra, we additionally suppose the spectrum is con¬ 
stant across latitude frequencies. A Gaussianity assumption does not appear to be justifiable, 


based on empirical Q-Q plots. Thus, we follow the suggestion of Fuentes (2002) and estimate 
marginal Matern parameters by minimizing the squared difference between theoretical log 
spectral density and log average periodogram, having averaged over all days, forecast hori¬ 
zons and latitude bands. The resulting least squares estimates are a = 0.074 and v = 0.94. 
Cross-covariance parameters are estimated by minimizing least squares distance to the av¬ 
erage empirical coherence functions, estimated by averaging over days and latitude bands. 


Table [l] contains the cross-covariance parameter estimates corresponding to all forecast 
horizons. As forecast horizon increases, all parameters decay; in fact, the decay is almost 
exactly linear for each variable beyond the 24h horizon. Fitting a linear model to the param¬ 
eters as a function of forecast horizon suggests the cross-covariance smoothness will decay to 
the marginal value v = 0.94 after approximately 16 days, whereas the cross-covariance scale 
meets the marginal value between 3-4 days. Although these ideas can be used to generate 
scientific hypotheses, there is substantial extrapolation for nu \2 that strongly depends on a 
linear assumption. 


A word of caution is in order; the estimates in Table [l] do not always imply a valid 
multivariate covariance structure. However, on any given grid the estimated parameters 
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may yield a valid model, it just cannot be guaranteed for all grids. One possibility is that 
the bivariate Matern is not sufficiently flexible to describe the stochastic structure of these 
fields, which is a call for further research in this area. 


4.2 Geopotential Height 


Our second example is on the same spatial domain, but whose values are geopotential heights. 
Geopotential height is the height (in meters) above sea level at which the atmospheric pres¬ 
sure is a certain level. In the atmospheric sciences, it is common to examine geopotential 


heights as indicators of climatic regimes, for instance Knapp and Yin (1996) examines the 
relationship between heights and temperature anomalies over a portion of the United States. 


Three of the most common geopotential height maps are the 850hPa, 500hPa and 300hPa 
maps. The first, 850hPa, approximately defines the planetary boundary layer, which is the 
lowest level of the atmosphere that interacts with the surface of the Earth (note lOOOhPa is 
approximately sea level). The 300hPa level is at the core of the jet stream, while the 500hPa 
approximately divides the atmosphere in half, and whose anomalies are used in part to 
assess climatological temperature variations. The vertical structure of geopotential heights 


is a focus of some interest within atmospheric sciences (Blackmon et al. 1979). 


We examine geopotential height reanalysis anomalies Zk(s,d) for k = 1,2,3 representing 
the 850hPa, 500hPa and 300hPa pressure levels on days d — 1,..., 181, the first 6 months 
of 2014. The anomalies are differences between the reanalysis height and a time-varying 
Nadaraya-Watson kernel smoothed estimate of the mean with a bandwidth of 5 days. Ex¬ 
periments suggest the results below are qualitatively robust against choices of the bandwidth 
and smoothing kernel. 


Similar to the previous section, we smooth the marginal process periodogram (JtJ) using a 
low-pass filter, and calculate smoothed empirical cross-periodograms yielding {/^(u;, d)}'f- =l . 
Then the squared coherence is estimated as the arithmetic average of each day’s empirical 
squared coherence estimate, 


i ^ 

181 4(w,d)4-(u>,d) 
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Figure 5: Estimated absolute coherence functions for the GEFS geopotential height data 
between (a) 850hPa and 500hPa, (b) 850hPa and 300hPa and (c) 500hPa and 300hPa. 


Figure [5] contains the three estimated pairwise absolute coherence functions. There is 
high coherence between the lower pressure levels at low frequencies, and some evidence 
of moderate coherence between all levels at low frequencies. We also note some strikingly 
different behavior than for the sea level pressure example. First, there is an apparent ridge in 
coherence at low frequencies (approximately 27r9/360) which may be indicative of equatorial 


planetary waves (Wang and Xie, 1996 Xie and Wang, 1996 Kiladis et al. 2009). Planetary 


waves can play crucial roles in the formation of tropical cyclones (Molinari et ah, 2007). 
Additionally, note that there is high coherence at the highest Fourier frequencies for all 
coherence functions (capping out at approximately 0.62,0.54, and 0.70). This is evidence 
of a nonseparable relationship in the frequency domain, and we are unaware of any current 
multivariate models that can adequately capture such behavior. One possible explanation 
for this high coherence at high frequencies is artifacts in the data assimilation scheme, in 
particular aberrant observational data leading to unusually large anomalies in geopotential 
height. Indeed, variables such as sea level pressure are well constrained by a wealth of 
observational data, while geopotential heights are less constrained, usually being observed 
by sparsely released weather balloons. 


Figure [6] shows pairwise plots for each pair of geopotential height anomalies. In particular 
there is strong evidence of a phase shift at a low frequency band between the 850hPa and 
both lower pressure heights. These frequencies indicate wavelengths of approximately 4000 — 
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Figure 6: Estimated phase functions for the GEFS geopotential height data between (a) 
850hPa and 500hPa, (b) 850hPa and 300hPa and (c) 500hPa and 300hPa. 


6000km, which is a typical wavelength for planetary equatorial waves, or Rossby waves 

). On the other hand, there 


Wang and Xie, 

1996 

Xie and Wang, 

1996 

Kiladis et ah, 


is not substantial evidence of phase shift between the pairs of heights at other frequencies. 
Most extant multivariate models utilize real-valued cross-spectral densities, and thus are 
insufficiently flexible to capture this type of phase shifted behavior at specific spectra. 


5 Discussion 


The notion of coherence, phase and gain are common in the time series literature, but have 
been yet unexplored for multivariate spatial processes. We casted these functions for multi¬ 
dimensional processes. The coherence between two variables can be interpreted as a measure 
of linear relationship at particular frequency bands, resulting in a complementary framework 
for comparing processes than the usual cross-covariance function. Phase and gain also yield 
straightforward interpretations as a physical space-shift and relative amplitude of frequency 
dependence when comparing two processes. We developed these ideas for stationary pro¬ 
cesses, and future research may be directed toward the analogous cases for nonstationary 


processes, perhaps extending the work of Fuentes (2002). 


Coherence, phase and gain can be estimated using smoothed cross-periodograms, and 
in our examples we showed that, as exploratory tools, these functions can be very useful 
in detecting structure that may not be readily captured using extant multivariate models. 
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We additionally illustrated that the coherence function gives a natural interpretation to the 
multivariate Matern cross-covariance parameters that have otherwise been uninterpretable, 
lending insight into an outstanding problem. 


A number of future research directions may be considered, including the adaptation of 
coherence to multivariate space-time processes. This work can also be seen as a call to 
develop more flexible multivariate models, perhaps working directly in the spectral domain 
rather than the covariance domain, where most previous work has fallen, echoing the call of 


Simpson et ah (2015). 


Appendix 

The following Lemma is useful in proving some results of the manuscript. 

Lemma 9. Suppose Zi(s) is a stationary processes on with covariance function Ci(h) 
having spectral density fi(uj) and Z 2 ( s) = f K(s — u)Zi(u)du where K is continuous, 
symmetric and square integrable with Fourier transform /^(u;). Then Z 2 ( s) has covari¬ 
ance function C^h) = f f K (u + v — h)K (v)Ci(u)dudv with associated spectral density 
f 2 (uj) = /i(u;)/a'(<i;) 2 . Additionally, the cross-covariance function between Z\ and Z 2 is 
Ci 2 (h) = f K(u — h)Ci(u)du with spectral density fi 2 (uj) = fi(Lj)fpc(u)- 

The proof of this Lemma involves straightforward calculations involving convolutions and 
is not included here. 

We recall the spectral representation for a stationary vector-valued process Z(s) G M p , s G 
M. d with matrix-valued covariance function C(h) having spectral measures F tJ ,i, j = 1 , ,p 
dehned on the Borel cr-algebra B on M d . There is a set of complex random measures M = 
(Mi, ..., M p ) on B such that if B,Bi,B 2 G B are disjoint, E Mi(B) = 0, E (Mi(B)Mj(B)) = 
Fij(B) and E(M i (Bi)Mj(B 2 )) — 0 for i, j — 1,... ,p. Then Z(s) has the spectral represen¬ 
tation 

(s) = j exp(zaj T s)dM(Lj), 

) for details. If all F t] admit associated spectral densities 
fij, then in shorthand we write E(dMj(u?)dMj(u;)) = /^(cu)du;. 


see 


Gihman and Skorohod (1974 
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Proof of Theorem The spectral representation implies 

Zi(s) = I exp T s) dMj. (u>), 

for complex-valued random measures Mj, i — 1,2. Then if A' has Fourier transform Fk , 
J K(u — s)Z 2 (u)du = ^ / A'(u — s) exp(io; T u)dM 2 (u;)du 
= y exp(icj 1 s)AA-(cj)dM 2 (o;) 

by a change of variables. Then, using that fa(uj)duj = E|dMj(u;)| 2 and 


E I / g(uj)dMi(uj) / /i(w)dMj(w) I = / p(o;)/i(c«;)/y(c«;)di 


lu; 


we have 
E 


Zi(s) - y AT(u - s)Z 2 (u)du = y f/n^) - f 12 (u>)F K (u ) - / 21 (u?)F ft: (a;)+ 


F K (co)F K (uj)f 22 ( 00 ) J du> 

= / EldM^w) - F*(u;)dM 2 (u>)| : 


The integrand is minimized for each u; if 


Fk{w) = 


E(dM 1 («)dM 2 («)) /i 2 (w) 


E|dM 2 M| 2 f 22 ( 00 )' 

That the density of f K(u — s)Z 2 (u)du is \fi 2 (oo)\ J /f 22 (oo) now follows by the convolution 


theorem for Fourier transforms. 


□ 


Proof of Proposition If fi(oo) is the Fourier transform of q, i — 1,2, the result immediately 
follows as the spectral density of C'y(h) is /j(u>)/j(uj). □ 

Proof of Proposition [5| This result follows directly from Lemma [9j □ 

Proof of Proposition [6| This result follows from Lemma [9] and that the spectral density for 
white noise is constant over all frequencies. □ 
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